The Annotation Process in OpenStreetMap [Not for distribution]
Authors
Abstract
In this paper we describe the analysis of 25,000 objects from the OpenStreetMap (OSM) databases of Ireland, the United Kingdom, Germany, and Austria. The objects were selected because they exhibit the characteristics of “heavily edited” objects, which we define as objects having 15 or more versions over their lifetime. Our results indicate that serious issues arise from the way contributors tag or annotate objects in OSM. Values assigned to the “name” and “highway” attributes are often subject to frequent and unexpected change. However, this “tag flip-flopping” is not found to be strongly correlated with increasing numbers of contributors. We also show problems with usage of the OSM ontology/controlled vocabulary. The majority of errors were caused by contributors choosing values from the ontology “by hand” and spelling these values incorrectly. These issues could have a detrimental effect on the quality of OSM data while also damaging the perception of OSM in the GIS community. The current state of tagging and annotation in OSM is not perfect. We feel that the problems identified result from a combination of the flexibility of the tagging process in OSM and the lack of a strict mechanism for checking adherence to the OSM ontology for specific core attributes. More studies comparing the names of features in OSM to recognised ground-truth datasets are required.
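The three checks named in the abstract (the 15-version threshold, “tag flip-flopping”, and misspelled ontology values) can be illustrated with a minimal Python sketch. It is not the authors' method, which the abstract does not specify. It assumes that an object's history is available as a chronological list of tag dictionaries, that a flip-flop is a tag returning to a value it held earlier, and that misspellings can be caught by fuzzy matching against a list of accepted values. The function names and the short `KNOWN_HIGHWAY_VALUES` list are hypothetical.

```python
from difflib import get_close_matches

# Threshold taken from the abstract: 15 or more versions.
HEAVY_EDIT_THRESHOLD = 15

# Hypothetical, incomplete subset of accepted "highway" values (illustration only).
KNOWN_HIGHWAY_VALUES = [
    "motorway", "primary", "secondary", "tertiary", "residential", "footway",
]


def is_heavily_edited(versions):
    """Return True if the object has at least HEAVY_EDIT_THRESHOLD versions."""
    return len(versions) >= HEAVY_EDIT_THRESHOLD


def count_flip_flops(versions, key):
    """Count how often the tag `key` changes back to a value it held earlier.

    `versions` is a chronological list of tag dictionaries (an assumed
    representation). Versions where the tag is absent are skipped.
    """
    seen = set()
    previous = None
    flips = 0
    for tags in versions:
        value = tags.get(key)
        if value is None:
            continue
        if value != previous:
            if value in seen:
                flips += 1  # the tag reverted to an earlier value
            seen.add(value)
            previous = value
    return flips


def suggest_correction(value, known_values=KNOWN_HIGHWAY_VALUES, cutoff=0.8):
    """Return the closest accepted value for a likely misspelling, else None."""
    if value in known_values:
        return None  # already a valid value
    matches = get_close_matches(value, known_values, n=1, cutoff=cutoff)
    return matches[0] if matches else None


history = [
    {"name": "Main Street"},
    {"name": "Main St"},
    {"name": "Main Street"},
    {"name": "Main Street", "highway": "residental"},
    {"name": "Main St", "highway": "residental"},
]
flips = count_flip_flops(history, "name")
suggestion = suggest_correction("residental")
```

Fuzzy matching with a similarity cutoff is only one way to flag hand-typed values; a stricter check would reject any value not present in the accepted list.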
Similar resources
An annotation scheme for Persian based on Autonomous Phrases Theory and Universal Dependencies
A treebank is a corpus with linguistic annotations above the level of the parts of speech. During the first half of the present decade, three treebanks have been developed for Persian either originally or subsequently based on dependency grammar: Persian Treebank (PerTreeBank), Persian Syntactic Dependency Treebank, and Uppsala Persian Dependency Treebank (UPDT). The syntactic analysis of a sen...
Tags Re-ranking Using Multi-level Features in Automatic Image Annotation
Automatic image annotation is a process in which computer systems automatically assign the textual tags related with visual content to a query image. In most cases, inappropriate tags generated by the users as well as the images without any tags among the challenges available in this field have a negative effect on the query's result. In this paper, a new method is presented for automatic image...
Tagging in Volunteered Geographic Information: An Analysis of Tagging Practices for Cities and Urban Regions in OpenStreetMap
In Volunteered Geographic Information (VGI) projects, the tagging or annotation of objects is usually performed in a flexible and non-constrained manner. Contributors to a VGI project are normally free to choose whatever tags they feel are appropriate to annotate or describe a particular geographic object or place. In OpenStreetMap (OSM), the Map Features part of the OSM Wiki serves as the de-f...
Exploiting multiple heterogeneous data sets for improving geotagging quality
Geotagging is the process of associating with textual data items the geographic position they denote, usually in the form of geographical coordinates (latitude and longitude). Automatic geotagging is often trivial relying on one of the many available gazetteers, such as OpenStreetMap (OSM). However, such knowledge bases are not free of errors, and, while this simple match works for popular loca...
Road Detection and Semantic Segmentation without Strong Human Supervision
Recently, convolutional neural networks (CNNs) trained with strong human supervision have been shown to achieve state-of-the-art performance for both road detection and semantic segmentation. However, collecting strongly labeled data for both requires detailed per-pixel annotations from humans, which renders data annotation highly costly and time consuming. Therefore, in this work we propose methods t...